- Article
Feature-Enhanced Erroneous Outlier Detection in Hydrological Time Series Using Ensemble Methods
- Banujan Kuhaneswaran,
- Golam Sorwar and
- Feifei Tong
- + 1 author
Data quality issues in hydrological time series directly affect hydrological modelling applications, including flood forecasting and water resource management. A critical challenge in hydrological monitoring is distinguishing erroneous outliers caused by sensor malfunctions or data transmission errors from natural extreme events such as floods; the two exhibit similar statistical characteristics but require opposite treatments in forecasting models. Current detection practices rely on generic algorithms without systematic validation or adaptation to hydrological temporal dependencies, limiting their effectiveness in operational contexts. This study addresses these gaps through a comprehensive framework for detecting erroneous outliers in daily hydrological time series. We engineered 19 features that capture temporal dependencies and hydrological patterns, and reduced them to six key features representing raw measurements, temporal patterns, and hydrological dynamics. We evaluated 13 detection algorithms across three categories: statistical methods (e.g., Extreme Studentised Deviate and Hampel filter), machine learning (ML) approaches (e.g., Isolation Forest and Local Outlier Factor), and feature-enhanced variants. We developed three data-driven ensemble strategies: Accurate (maximising F1-score), Diverse (balancing performance with method diversity), and Fast (prioritising computational efficiency). The framework was validated by injecting controlled outliers into recorded hydrological data from five gauge stations in the Tweed River catchment, Australia. The results showed that the ensemble methods achieved satisfactory F1-scores (0.6–0.9) in detecting erroneous outliers. Statistical testing also identified the top-performing detection algorithms. The framework developed in this paper provides a validated tool for quality control in hydrological analysis, with potential applications in drought monitoring and flood forecasting systems.
8 February 2026
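
The validation idea described in the abstract (inject controlled outliers into a series, run a detector, score it with F1) can be illustrated with a minimal sketch. This is not the authors' implementation: the series is synthetic rather than Tweed River data, the function names `hampel_flags` and `f1_score` are hypothetical, and the window size, threshold, spike magnitude, and number of injected outliers are all assumed values chosen only for illustration.

```python
import numpy as np

def hampel_flags(x, half_window=3, n_sigmas=3.0):
    """Flag points that deviate from the local median by more than
    n_sigmas scaled MADs (a basic Hampel filter; parameters are assumed)."""
    x = np.asarray(x, dtype=float)
    flags = np.zeros(x.size, dtype=bool)
    k = 1.4826  # scales the MAD to the standard deviation for Gaussian data
    for i in range(x.size):
        lo = max(0, i - half_window)
        hi = min(x.size, i + half_window + 1)
        window = x[lo:hi]
        med = np.median(window)
        mad = k * np.median(np.abs(window - med))
        flags[i] = abs(x[i] - med) > n_sigmas * mad
    return flags

def f1_score(truth, pred):
    """F1 = 2TP / (2TP + FP + FN), computed from boolean arrays."""
    truth = np.asarray(truth, dtype=bool)
    pred = np.asarray(pred, dtype=bool)
    tp = np.sum(truth & pred)
    fp = np.sum(~truth & pred)
    fn = np.sum(truth & ~pred)
    denom = 2 * tp + fp + fn
    return float(2 * tp / denom) if denom else 0.0

# Synthetic stand-in for one year of daily observations (assumption).
rng = np.random.default_rng(0)
n = 365
t = np.arange(n)
series = 10 + 3 * np.sin(2 * np.pi * t / 365) + rng.normal(0, 0.2, n)

# Inject controlled spikes at known positions so ground truth is available.
truth = np.zeros(n, dtype=bool)
idx = rng.choice(n, size=10, replace=False)
truth[idx] = True
corrupted = series.copy()
corrupted[idx] += 8.0

pred = hampel_flags(corrupted)
score = f1_score(truth, pred)
print(f"F1 on injected outliers: {score:.2f}")
```

Because the positions of the injected spikes are known, the same scoring loop could be repeated for each candidate detector to compare them, which is the general pattern the abstract describes for ranking algorithms and building ensembles.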








![(a) Map showing the location of the Qaidam Basin (shaded in gray) on the northern margin of the Qinghai–Tibetan Plateau. (b) Distribution of the Mahai Basin (indicated by the red box) within the QB and the location of the study area (modified from [20]). (c) Hydrogeological cross-section from the Saishiteng Mountains to the Lenghu (LH) anticline belt (Yellow shading denotes confined groundwater, while green shading represents pore water in unconsolidated sediments. Red blocks indicate intrusive bodies, light grey blocks signify metamorphic rocks, and red lines represent faults.).](https://mdpi-res.com/cdn-cgi/image/w=281,h=192/https://mdpi-res.com/water/water-18-00443/article_deploy/html/images/water-18-00443-g001-550.jpg)

